Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Middleware-based compression and decompression #194

Merged
merged 5 commits into from
May 16, 2019
Merged

Middleware-based compression and decompression #194

merged 5 commits into from
May 16, 2019

Conversation

fairingrey
Copy link
Contributor

@fairingrey fairingrey commented Apr 28, 2019

Provides compression middleware for outgoing HTTP responses and decompression middleware for incoming HTTP requests.

Description

These two middlewares do the following:

  • Automatically handle incoming request decompression for Content-Encoding headers that contain one or more applications of gzip, deflate, br, or zstd.
  • Encode outgoing responses with preferred encoding behavior for any incoming request with a valid Accept-Encoding header.
  • Allow one to define compression levels for each compression algorithm available.

The decompression middleware actually depends on getting a mutable handle on the request from the context, hence why a new function was added here.

Motivation and Context

This closes #26.

Tide currently has no way of handling encoded requests or handling outgoing request compression. This middleware is one approach to solve that problem, although there exist a multitude of approaches that may work. This middleware may also serve as a point of discussion for rustasync/team#52.

How Has This Been Tested?

Tests are written! They're not very thorough, but that's because it's only to test their functionality. More rigorous tests for each of the stream encoders/decoders are in the async-compression crate.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@fairingrey
Copy link
Contributor Author

fairingrey commented Apr 28, 2019

EDIT: removed old stuff

EDIT 2: Following the stable core isolation, it now lives in its own crate.

src/middleware/compression.rs Outdated Show resolved Hide resolved
src/middleware/compression.rs Outdated Show resolved Hide resolved
@fairingrey
Copy link
Contributor Author

fairingrey commented Apr 29, 2019

EDIT: No longer applies -- we got compression middleware working!

@fairingrey
Copy link
Contributor Author

fairingrey commented Apr 29, 2019

By the way, thanks to everyone here for helping review and contribute to this PR!

I'll try to bring it up during the next async WG meeting, but currently async compression works thanks to Nemo's help and the now new existence of the adapter crate at https://github.com/rustasync/async-compression-rs.

@fairingrey
Copy link
Contributor Author

Updated my comments to reflect my current status. Once #193 gets merged, I can start writing some tests for compressing Tide request bodies.

@fairingrey
Copy link
Contributor Author

fairingrey commented May 1, 2019

One other concern I also had was if the request should also upsert the Vary header with value Accept-Encoding as following https://developer.mozilla.org/en-US/docs/Web/HTTP/Compression#End-to-end_compression. Might be a necessary detail if we're going by what it says there.

@fairingrey
Copy link
Contributor Author

fairingrey commented May 3, 2019

Phew... Finished all of the logic for compression and decompression handling over streams, but both implementations aren't tested yet because of current blockers over the outdated futures API on the latest nightly (deny warnings).

One other concern I had (and this is mentioned in my commit message) is that there is currently no way to get a &mut Request handle from the Context. This is another blocker since to decompress the request into plain data it's necessary to alter the incoming request being passed in.

I'll try to bring this concern up during the next meeting, but otherwise I think I've made a lot of progress! It would be good to get more review now while it's appropriate, I think.

Also, Nullus157/async-compression#10 will need to be merged before I can use it. Well, after it's also published to crates.io once things are sorted out there.

I guess now I should work on tests, but because futures isn't updated yet (and the deny warnings flag is enabled), it seems a bit painful right now to.

@kdar
Copy link

kdar commented May 3, 2019

This could be due to my lack of experience with futures, but when I try to use this with hyper 0.12, I can only ever get it to spit out the first 8000 bytes. Even if I try to block and concat/collect the stream, it still only returns the first 8000 bytes.

Here is how I'm using it in hyper:

use std::io;
use async_compression::stream::deflate;
use futures::{future, stream::Stream};
use futures03::{compat::Stream01CompatExt, stream::TryStreamExt};

let (mut parts, mut body) = response.into_parts();
let compressed = deflate::DeflateStream::new(
  body
    .map(|v| v.into_bytes())
    .map_err(|e| io::Error::new(io::ErrorKind::Other, e))
    .compat(),
  flate2::Compression::default(),
);

body = Body::wrap_stream(compressed.compat());
let response = hyper::Response::from_parts(parts, body);

This ends up setting transfer-encoding to chunked, and only returning 8000 bytes.
I then tried this craziness to see if I could just grab everything from the stream and put it into a bytes::Bytes, but it still only returned 8000 bytes.

use futures03::executor::block_on;
use futures03::stream::StreamExt;
use hyper::body::Payload;
let payload: Vec<_> = block_on(compressed.map(|v| v.unwrap()).collect());
let mut fin: Vec<_> = vec![];
for b in payload {
  fin.put(b);
}
body = Body::from(fin);

I just want to confirm whether this is my issue or not.

@fairingrey
Copy link
Contributor Author

fairingrey commented May 3, 2019

@kdar I'd recommend you open an issue in the async-compression repo and provide your code or steps to reproduce, and I can try helping you there. While this PR is sort of related, this is mostly for tracking progress on this middleware. But perhaps I can figure out a bug from what you're doing.

8000 bytes is the size of the internal buffer for those compression streams though, so if something weird is happening it's likely related to that.

EDIT: Issue should be fixed now since Nullus157/async-compression#11 was merged

Cargo.toml Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
@fairingrey
Copy link
Contributor Author

I'll be writing tests for the compression middleware in the meantime I'm looking to get the issue I'm having unblocked. Will likely start tomorrow, since I have some stuff to do today.

@fairingrey
Copy link
Contributor Author

I'll be moving some of my Accept-Encoding parsing logic to https://github.com/rustasync/accept-encoding under the recommendation of yoshua -- hopefully shouldn't take too long 😅

@fairingrey
Copy link
Contributor Author

fairingrey commented May 12, 2019

Currently waiting on both http-rs/accept-encoding#2 and Nullus157/async-compression#13 until I continue progress.

EDIT: Squashed all my commits into one and rebased off master. Going to start writing docs and tests now or later this week.

EDIT 2: Finished!

@fairingrey fairingrey marked this pull request as ready for review May 12, 2019 23:54
Copy link
Contributor

@pickfire pickfire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Cargo.toml Outdated Show resolved Hide resolved
@fairingrey fairingrey changed the title Middleware-based compression and decompression handling Middleware-based compression and decompression May 13, 2019
@tirr-c
Copy link
Collaborator

tirr-c commented May 13, 2019

How do you think about using Encoding and Option<Encoding> accordingly? Currently we have Encoding::Any (or Encoding::None) to indicate "no specific preference", but it seems like Encoding::Any is often special-cased.

@fairingrey
Copy link
Contributor Author

fairingrey commented May 13, 2019

Oh, that's a good approach. Using Option<Encoding> actually hadn't crossed my mind while writing it, but I think it'd look pretty decent.

I can rewrite that part in the Accept-Encoding crate such that when its' re-exported here I don't have to special-case it. I'll try it in a moment.

EDIT: Done

@prasannavl
Copy link
Contributor

@fairingrey - This is great! Thank you for all the hard work! :)

Some thoughts:

  • One tiny little thing I'd have done is do this as a separate PR: https://github.com/rustasync/tide/pull/194/files#diff-e0934bb6f70a4f8c3baf22a773c14ec7
  • The git dependencies - I think in the long run, whichever can make it into the main Cargo.toml as patch-crate.io as git master on rustasync org owned reasonably stable libs should be fine. Other than that, I think a good way to move forward for early deps would be to use a pinned branch (say tide), rather than master - so code doesn't inadvertently get changed -- I say this since git deps will pass through the tide CI/CD, and there would be no way catch these, since they can be updated separately and end up wreaking havoc depending on when the user actually downloads the cargo crates.

@fairingrey
Copy link
Contributor Author

fairingrey commented May 14, 2019

@prasannavl Thank you!

On the topic of Cargo.toml, I could have, probably -- but it was just a small thing and I was including new dependencies anyway 😅

As for the git dependencies, I agree. The async-compression-rs and accept-encoding crates likely won't be changed for a while (aside from non-breaking changes that Nemo or I am making), but it would be in our best interests to pin a stable branch we know won't be changing without the necessary PRs to update them.

EDIT: (Note to self, if and/or when #219 gets merged, consider moving this middleware into its own crate tide-compression that sits at the top-level workspace alongside tide and tide-log)

@fairingrey
Copy link
Contributor Author

fairingrey commented May 14, 2019

Actually, lemme make one tiny new change to decrease compilation time (since this middleware doesn't use the bufread features in async-compression, among other things).

Copy link
Collaborator

@tirr-c tirr-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go one step further 🙂

src/middleware/compression.rs Outdated Show resolved Hide resolved
src/middleware/compression.rs Outdated Show resolved Hide resolved
@fairingrey
Copy link
Contributor Author

fairingrey commented May 15, 2019

Since #219 was merged, looks like I'll be doing a rebase now.

Thanks for all the review and feedback!

EDIT: Re-requesting for final final review. 😅 💦

Copy link
Collaborator

@tirr-c tirr-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to make the example at Readme into an example in tide-compression/examples/.

tide-compression/README.md Outdated Show resolved Hide resolved
tide-compression/README.md Outdated Show resolved Hide resolved
@fairingrey
Copy link
Contributor Author

Actually, I'll probably move the example out to the examples folder, thanks for the suggestion 👍

@fairingrey
Copy link
Contributor Author

fairingrey commented May 15, 2019

@tirr-c Sorry about that! How about now?

EDIT: Neither of the given example commands actually test for decompression... hrm... Dunno if there's an easy way to test. cURL doesn't really support it.

Copy link
Collaborator

@tirr-c tirr-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, one more thing... This review ping-pong became lengthy than what I've expected 😅 but every other thing seems okay for me!

tide-compression/README.md Outdated Show resolved Hide resolved
@fairingrey
Copy link
Contributor Author

Indeed... There's a lot of stuff here, and this is a fairly large component -- it's good to get lots of review in so we won't have to double over it so much when it's integrated 😅

But I think this is good now! It should be good to merge tomorrow, hopefully 💦

Copy link
Collaborator

@tirr-c tirr-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that the example server just closes the connection when invalid request body (that can't be decompressed) is given. The fix can be in another PR though.

Again, great work! 👍 👍

version = "0.1.0"

[dependencies]
tide = { path = "../tide" }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a version field in order to publish this crate. Let's do this after structure revamp!

Copy link
Contributor

@prasannavl prasannavl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in great shape! 🥇
Thank you @fairingrey! :D

/aside #222 (comment), might require some policy based changes, but I see no reason to hold this PR for that.

Cargo.toml Outdated Show resolved Hide resolved
@fairingrey
Copy link
Contributor Author

fairingrey commented May 16, 2019

Thanks for all the review, everyone! I'll get this merged now since it looks like we have not much else to change.

If anything else needs to be sharpened up, we can do it through a future PR, I imagine.

Will be writing a blog post about this in the near future as follows the meeting notes: https://github.com/rustasync/team/blob/master/meetings/2019-05-16.md

@fairingrey fairingrey merged commit d64d063 into http-rs:master May 16, 2019
@fairingrey fairingrey deleted the compression branch May 16, 2019 18:41
@fairingrey fairingrey mentioned this pull request May 16, 2019
8 tasks
@yoshuawuyts yoshuawuyts mentioned this pull request Nov 3, 2019
yoshuawuyts added a commit that referenced this pull request Nov 3, 2019
Signed-off-by: Yoshua Wuyts <yoshuawuyts@gmail.com>
yoshuawuyts added a commit that referenced this pull request Nov 3, 2019
Signed-off-by: Yoshua Wuyts <yoshuawuyts@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Middleware for compression
7 participants